Architecting Fault-Tolerant Software Systems
Author
Abstract
The increasing size and complexity of software systems make it hard to prevent or remove all possible faults. Faults that remain in the system can eventually lead to a system failure. Fault tolerance techniques are introduced for enabling systems to recover and continue operation when they are subject to faults. Many fault tolerance techniques are available, but incorporating them in a system is not always trivial. We consider the following problems in designing a fault-tolerant system. First, existing reliability analysis techniques generally do not prioritize potential failures from the end-user perspective and accordingly do not identify sensitivity points of a system. Second, existing architecture styles are not well suited for specifying, communicating and analyzing design decisions that are particularly related to the fault-tolerant aspects of a system. Third, there are no adequate analysis techniques that evaluate the impact of fault tolerance techniques on the functional decomposition of software architecture. Fourth, realizing a fault-tolerant design usually requires a substantial development and maintenance effort.
To tackle the first problem, we propose a scenario-based software architecture reliability analysis method, called SARAH, that benefits from mature reliability engineering techniques (i.e. FMEA, FTA) to provide an early reliability analysis of the software architecture design. SARAH evaluates potential failures from the end-user perspective to identify sensitive points of a system without requiring an implementation.
As a new architectural style, we introduce Recovery Style for specifying fault-tolerant aspects of software architecture. Recovery Style is used for communicating and analyzing architectural design decisions and for supporting detailed design with respect to recovery.
As a solution for the third problem, we propose a systematic method for optimizing the decomposition of software architecture for local recovery, which is an effective fault tolerance technique to attain high system availability. To support the method, we have developed an integrated set of tools that employ optimization techniques, state-based analytical models (i.e. CTMCs) and dynamic analysis on the system.
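To illustrate how a state-based analytical model can relate recovery to availability, the following is a minimal sketch of the simplest possible CTMC: a single unit that alternates between an "up" and a "down" state. It is not the model or tooling described in the abstract; the function name and the failure and repair rates are hypothetical, chosen only to show that faster recovery of a failed unit raises steady-state availability.

```python
def steady_state_availability(failure_rate: float, repair_rate: float) -> float:
    """Long-run fraction of time a two-state CTMC (up <-> down) spends in 'up'.

    The chain leaves 'up' at rate failure_rate and leaves 'down' at rate
    repair_rate. Balancing the flow between the two states gives
    pi_up = repair_rate / (failure_rate + repair_rate).
    """
    if failure_rate < 0 or repair_rate <= 0:
        raise ValueError("failure_rate must be >= 0 and repair_rate must be > 0")
    return repair_rate / (failure_rate + repair_rate)


# Hypothetical rates (per hour), for illustration only.
# Assumption: restarting a small part of the system completes faster
# than restarting the whole system, so its repair rate is higher.
slow_recovery = steady_state_availability(failure_rate=0.01, repair_rate=2.0)
fast_recovery = steady_state_availability(failure_rate=0.01, repair_rate=20.0)

print(f"slow recovery availability: {slow_recovery:.6f}")
print(f"fast recovery availability: {fast_recovery:.6f}")
```

A realistic analysis of a decomposition would need a larger state space that captures which modules are recovering and which functions remain available; this sketch only shows the basic rate-to-availability calculation.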
Similar resources
Architecting Fault Tolerant Systems
As building trustworthy (dependable) systems is one of the major challenges faced by software developers, dealing with various threats (such as errors, faults and failures) is becoming one of the main foci of software and system research and development. At the core of ensuring system dependability is acceptance of the fact that errors always happen in spite of all the efforts to eliminate faul...
Architecting Fault-tolerant Component-based Systems: from requirements to testing
Fault tolerance is one of the most important means to avoid service failure in the presence of faults, so to guarantee they will not interrupt the service delivery. Software testing, instead, is one of the major fault removal techniques, realized in order to detect and remove software faults during software development so that they will not be present in the final product. This paper shows how ...
Specification-Driven Prototyping for Architecting Dependability
This paper describes a major part of an architecting methodology developed for safety-critical fault-tolerant software systems. The methodology coverage centers on specification-driven prototyping. This approach to prototyping is seen to be superior to the customary approaches of throwaway and evolutionary prototyping. A still developmental form of representation, higher-level statecharts, provi...
Towards Systematic Design of Adaptive Fault Tolerant Systems
The development of modern distributed software systems poses a significant engineering challenge. The system architecture should exhibit plasticity and a high degree of reconfigurability to enable automated adaptation to continuously changing operating conditions and component failures. Traditional engineering approaches cope poorly with the complexity of such systems to ensure their r...
Workshop on Architecting Dependable Systems
In comparison with the state of the art in the field of Web Services architectures and their composition, we propose to exploit the concept of CA Actions to enable dependable composition of Web Services. CA Actions introduce a mechanism for structuring fault tolerant concurrent systems through the generalization of the concepts of atomic actions and transactions, and are adapted to the compo...